[For Everyone] From Scientific Prediction and the Power of Simplicity to Team Collaboration
Description
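Ever wonder why teaching a robot to play with the "dumbest" toys can actually teach it to grasp anything? In this episode, we explore how to turn the mysterious "alchemy" of AI into a rigorous science, look at how an expert AI can learn to "speak plainly" and bring a novice AI along, and finally show that the many different tuning recipes all share one simple objective. Let's jump into today's frontier briefing!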
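00:00:28 A Guide to Tuning Large AI Models: From Black Magic to Science
00:05:39 Back to Basics: The Dumbest Method May Be the Best One
00:11:25 Want a Smarter Robot? First Teach It to Play with "Dumb" Toys
00:16:41 How Can an Expert AI Bring a Novice AI Along?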
00:00 Tuning Recipes for Large Models: Do All Roads Lead to Rome?
Papers covered in this episode:
[LG] The Art of Scaling Reinforcement Learning Compute for LLMs
[Meta & UT Austin & UC Berkeley]
https://arxiv.org/abs/2510.13786
---
[RO] VLA-0: Building State-of-the-Art VLAs with Zero Modification
[NVIDIA]
https://arxiv.org/abs/2510.13054
---
[RO] Learning to Grasp Anything by Playing with Random Toys
[UC Berkeley]
https://arxiv.org/abs/2510.12866
---
[LG] Tandem Training for Language Models
[Microsoft & EPFL & University of Toronto]
https://arxiv.org/abs/2510.13551
---
[LG] What is the objective of reasoning with reinforcement learning?
[University of Pennsylvania & UC Berkeley]
https://arxiv.org/abs/2510.13651